24 research outputs found

    Efficient target-response interpolation for a graphic equalizer

    Proceedings of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), held in Shanghai, China, 20-25 March 2016. A graphic equalizer is an adjustable filter in which the command gain of each frequency band is practically independent of the gains of the other bands. Designing a graphic equalizer with high precision requires evaluating a target response that interpolates the magnitude response at several frequency points between the command gains. Good accuracy has previously been achieved using polynomial interpolation methods such as cubic Hermite or spline interpolation. However, these methods require large computational resources, which is a limitation in real-time applications. This paper proposes an efficient way of computing the target response without sacrificing approximation accuracy. The new approach, called Linear Interpolation with Constant Segments (LICS), reduces the computing time of the target response by 55% and has an intrinsically parallel structure. The performance of the LICS method is assessed on an ARM Cortex-A7 core, which is commonly used in embedded systems. This work was conducted in spring 2015 while the first author was a visiting postdoctoral researcher at Aalto University. The research was partly funded by the TIN2014-53495-R and TIN2011-23283 projects of the Ministerio de Economía y Competitividad and FEDER.
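The interpolation step described above can be sketched in a few lines. The following Python snippet is only an illustrative approximation: it builds a piecewise-linear design target (in dB, on a log-frequency axis) between hypothetical command gains, whereas the actual LICS scheme additionally mixes in constant segments as defined in the paper; the band centers and gain values here are invented for the example.

```python
import numpy as np

# Hypothetical 10-band graphic EQ: command gains (dB) at octave-spaced centers.
centers_hz = np.array([31.25 * 2**k for k in range(10)])   # 31.25 Hz ... 16 kHz
gains_db = np.array([0, 3, 6, 3, 0, -3, -6, -3, 0, 3], dtype=float)

def target_response_linear(freqs_hz, centers_hz, gains_db):
    """Piecewise-linear target response in dB, interpolated on a
    log-frequency axis between the command-gain points (a simplification
    of the paper's LICS scheme, which also uses constant segments)."""
    return np.interp(np.log2(freqs_hz), np.log2(centers_hz), gains_db)

# Evaluate the target at one extra point between each pair of band centers
# (geometric midpoints), as when building a high-resolution design target.
midpoints = np.sqrt(centers_hz[:-1] * centers_hz[1:])
target = target_response_linear(midpoints, centers_hz, gains_db)
```

The point of replacing spline evaluation with expressions like `np.interp` above is that each output point depends only on its two neighbouring gains, which is what gives the method its low cost and parallel structure.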

    An Efficient Implementation of Parallel Parametric HRTF Models for Binaural Sound Synthesis in Mobile Multimedia

    The extended use of mobile multimedia devices in applications such as gaming, 3D video and audio reproduction, immersive teleconferencing, and virtual and augmented reality demands efficient algorithms and methodologies. All these applications require real-time spatial audio engines capable of dealing with intensive signal processing operations while facing a number of constraints related to computational cost, latency and energy consumption. Most mobile multimedia devices include a Graphics Processing Unit (GPU) that is primarily used to accelerate video processing tasks, providing high computational capability due to its inherently parallel architecture. This paper describes a scalable parallel implementation of a real-time binaural audio engine for GPU-equipped mobile devices. The engine is based on a set of head-related transfer functions (HRTFs) modelled with a parametric parallel structure, allowing efficient synthesis and interpolation while reducing the size required for HRTF data storage. Several strategies to optimize the GPU implementation are evaluated on a well-known kind of processor present in a wide range of mobile devices. In this context, we analyze both the energy consumption and the real-time capabilities of the system by exploring different GPU and CPU configuration alternatives. Moreover, the implementation uses the OpenCL framework, guaranteeing the portability of the code.
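The "parametric parallel structure" mentioned above is, broadly, a bank of low-order filters that all process the same input independently and whose outputs are summed, which is what makes it map well onto GPU threads. A minimal Python sketch of that idea, using made-up second-order sections rather than real HRTF coefficients:

```python
import numpy as np

def biquad(x, b, a):
    """Direct-form II transposed biquad; b = (b0, b1, b2), a = (1, a1, a2)."""
    b0, b1, b2 = b
    _, a1, a2 = a
    y = np.zeros_like(x)
    s1 = s2 = 0.0
    for n, xn in enumerate(x):
        y[n] = b0 * xn + s1
        s1 = b1 * xn - a1 * y[n] + s2
        s2 = b2 * xn - a2 * y[n]
    return y

def parallel_filter(x, sections):
    """Parallel structure: every second-order section filters the same
    input independently, and the branch outputs are summed. The branch
    independence is what maps naturally onto GPU work-items."""
    return sum(biquad(x, b, a) for b, a in sections)

# Toy example with two hypothetical sections (not real HRTF data).
sections = [((0.5, 0.0, 0.0), (1.0, -0.2, 0.1)),
            ((0.3, 0.1, 0.0), (1.0, 0.1, 0.05))]
impulse = np.zeros(8)
impulse[0] = 1.0
h = parallel_filter(impulse, sections)   # impulse response of the bank
```

Storing only a few coefficients per section, instead of full impulse responses, is also what reduces the HRTF storage footprint the abstract refers to.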

    Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures

    Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors, including, for example, the number and granularity of the computations and the use of the devices' memories. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with the Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open-region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison by leveraging the same levels of parallelism on both types of platform. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. The paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, much more so when multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while the multi-GPU version achieves speedups larger than 250. Funding for open access charge: CRUE-Universitat Jaume

    Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication

    System-on-Chip (SoC) devices can combine low-power multicore processors with a small graphics accelerator (GPU), offering a trade-off between computational capacity and low power consumption. In this work we use the LLFI-GPU fault injection tool on one of these devices to compare the sensitivity to soft errors of two different CUDA versions of a matrix multiplication benchmark. Specifically, we perform fault injection campaigns on a Jetson TK1 development kit, a board equipped with a SoC that includes an NVIDIA "Kepler" Graphics Processing Unit (GPU). We evaluate the effect of modifying both the problem size and the thread-block size on the behaviour of the algorithms. Our results show that the block version of the matrix multiplication benchmark, which leverages the shared memory of the GPU, is not only faster than the element-wise version but also much more resilient to soft errors. We also use the cuda-gdb debugger to analyze the main causes of the crashes in the code due to soft errors. Our experiments show that most of the errors are due to accesses to invalid positions in the different memories of the GPU, which causes the block version to suffer a higher percentage of this kind of error.
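The shared-memory ("block") CUDA version referred to above stages tiles of the input matrices in fast on-chip memory. The tiling access pattern itself can be illustrated in plain NumPy; this is only a sketch of the blocking idea, not the paper's CUDA code, and the tile size is an arbitrary choice:

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Blocked matrix multiply: each (i, j) tile of C is accumulated from
    tile-sized sub-blocks of A and B -- the same access pattern a CUDA
    thread block uses when it stages tiles in shared memory before
    computing partial products."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m))
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                C[i:i+tile, j:j+tile] += (
                    A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
                )
    return C

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 48))
B = rng.standard_normal((48, 80))
C = tiled_matmul(A, B, tile=16)   # matches A @ B
```

On a GPU, confining most accesses to a small tile held in shared memory both speeds up the kernel and, per the paper's findings, shrinks the window in which a corrupted index can reach an invalid global-memory position.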

    Exposición mediante realidad virtual para el TOC: ¿Es factible?

    Virtual reality exposure therapy (VRET) is receiving increased attention, especially in the fields of anxiety and eating disorders. This study is the first trial examining the utility of VRET from the perspective of OCD patients. Four women with OCD assessed the sense of presence, emotional engagement, reality judgment, and the anxiety and disgust levels they experienced in four scenarios, called the Contaminated Virtual Environment (COVE), in which they had to perform several activities. The COVE scenarios were presented on a Full HD 46" TV connected to a laptop and to a Kinect device. Results indicate that the COVE scenarios generated a good sense of presence. The anxiety and disgust levels increased as the virtual contamination increased, and the anxiety produced was related to the emotional engagement and sense of presence.

    Evaluating the computational performance of the Xilinx UltraScale+ EG heterogeneous MPSoC

    The emergent technology of Multi-Processor System-on-Chip (MPSoC), which combines heterogeneous computing with the high performance of Field Programmable Gate Arrays (FPGAs), is a very interesting platform for a huge number of applications ranging from medical imaging and augmented reality to high-performance computing in space. In this paper, we focus on the Xilinx Zynq UltraScale+ EG heterogeneous MPSoC, which is composed of four different processing elements (PEs): a dual-core ARM Cortex-R5, a quad-core ARM Cortex-A53, a graphics processing unit (GPU) and a high-end FPGA. Making proper use of the heterogeneity and the different levels of parallelism of this platform is a challenging task. This paper evaluates the platform and each of its PEs on fundamental operations in terms of computational performance. To this end, we evaluate image-based applications and a matrix multiplication kernel. The image-based applications leverage the heterogeneity of the MPSoC, strategically distributing their tasks among both kinds of CPU cores and the FPGA. For the matrix multiplication kernel, we analyze each PE separately using different benchmarks in order to assess and compare their performance in terms of MFlops. Operations of this kind are carried out, for example, in a large number of space-related applications, where MPSoCs are currently gaining momentum. The results highlight that the different PEs can collaborate efficiently to accelerate the computationally demanding tasks of an application. Notably, leveraging the parallel OpenBLAS library we achieve up to 12 GFlops with the four Cortex-A53 cores of the platform, a considerable performance for this kind of device.
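The GFlops figures quoted above come from timing matrix products. A small, hedged Python sketch of that kind of measurement (the function name is invented, and the 2·n³ operation count per n×n product is the conventional convention, not a detail taken from the paper):

```python
import time
import numpy as np

def matmul_gflops(n, reps=3):
    """Estimate sustained GFLOP/s of an n x n matrix multiply, counting the
    conventional 2*n**3 floating-point operations per product (the kind of
    metric behind GFlops figures reported for BLAS benchmarks)."""
    A = np.random.rand(n, n).astype(np.float32)
    B = np.random.rand(n, n).astype(np.float32)
    t0 = time.perf_counter()
    for _ in range(reps):
        A @ B                       # BLAS sgemm under the hood
    elapsed = (time.perf_counter() - t0) / reps
    return 2.0 * n**3 / elapsed / 1e9

g = matmul_gflops(256)
```

On the Cortex-A53 cluster, the equivalent measurement would call OpenBLAS's `sgemm`/`dgemm` directly; the arithmetic of the metric is the same.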

    Real-time massive convolution for audio applications on GPU

    [EN] Massive convolution is the basic operation in multichannel acoustic signal processing, a field that has experienced major development in recent years. One reason is the increase in the number of sound sources used in playback applications available to users; another is the growing need to incorporate new effects and to improve the listening experience. Massive convolution requires high computing capacity. GPUs offer the possibility of parallelizing these operations, which allows us to obtain the processing result in much shorter time and to free up CPU resources. One important aspect lies in the possibility of overlapping the transfer of data from CPU to GPU (and vice versa) with the computation, in order to support real-time applications. Thus, synthesis of 3D sound scenes could be achieved in a peer-to-peer music streaming environment using a simple GPU in your computer, while the CPU is used for other tasks. Nowadays, these effects are obtained in theaters or funfairs at very high cost, requiring a large quantity of resources. Our work therefore focuses on two main points: describing an efficient massive convolution implementation and incorporating this task into real-time multichannel-sound applications. © 2011 Springer Science+Business Media, LLC. This work was partially supported by the Spanish Ministerio de Ciencia e Innovacion (projects TIN2008-06570-C04-02 and TEC2009-13741), Universidad Politecnica de Valencia through PAID-05-09, and Generalitat Valenciana through project PROMETEO/2009/2013. Belloch Rodríguez, JA.; Gonzalez, A.; Martínez Zaldívar, FJ.; Vidal Maciá, AM. (2011). Real-time massive convolution for audio applications on GPU. Journal of Supercomputing. 58(3):449-457. https://doi.org/10.1007/s11227-011-0610-8
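The core of fast massive convolution is performing the long convolutions in the frequency domain. A minimal Python sketch of that FFT trick, deliberately leaving out the filter partitioning and the CPU-GPU transfer overlap that the paper is actually concerned with:

```python
import numpy as np

def fft_convolve(x, h):
    """Linear convolution via the FFT: zero-pad both signals to a common
    power-of-two length covering the full result, multiply the spectra,
    and transform back. Real-time engines partition h into blocks and
    overlap data transfers with this computation; here we show only the
    core frequency-domain step."""
    n = len(x) + len(h) - 1                   # length of the linear convolution
    nfft = 1 << (n - 1).bit_length()          # next power of two >= n
    X = np.fft.rfft(x, nfft)
    H = np.fft.rfft(h, nfft)
    return np.fft.irfft(X * H, nfft)[:n]

x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, 1.0])
y = fft_convolve(x, h)                        # equals np.convolve(x, h)
```

Because every channel's spectrum multiplication is independent, a massive (many-channel) version of this operation parallelizes naturally across GPU threads.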

    Audiovisual Tool for understanding Audio concepts for being used in bachelor’s degree programmes

    [EN] In the audio signal processing field, it is difficult to explain concepts such as compression, masking, quantization and sampling, among others. Furthermore, most of these concepts require the use of audio laboratories and multiple practical sessions that students must carry out. Another issue is that some students are not able to internalize these concepts straightforwardly and require additional practical sessions. To address these problems, we have developed an audiovisual tool, implemented in Matlab, that can be used by professors and students. The tool allows users to analyze, test and apply audio concepts to real audio signals. The developed tool has been successfully tried by professors in the audio signal processing field, who recommend its use in upcoming academic courses. This research has been partly funded by TIN2014-53495-R, BES-2013-063783, BES-2013-065034, TEC2013-47141-C4-4-R and FPU AP-2012/71274. Antoñanzas Manuel, C.; Gutiérrez Parera, P.; Simarro Haro, MDLA.; Belloch, JA. (2016). Audiovisual Tool for understanding Audio concepts for being used in bachelor's degree programmes. In: 2nd International Conference on Higher Education Advances (HEAd'16). Editorial Universitat Politècnica de València. 495-502. https://doi.org/10.4995/HEAD16.2016.2923

    Accelerating multi-channel filtering of audio signal on ARM processors

    The researchers from Universitat Jaume I are supported by the CICYT projects TIN2014-53495-R and TIN2011-23283 of the Ministerio de Economía y Competitividad and FEDER. The authors from the Universitat Politècnica de València are supported by projects TEC2015-67387-C4-1-R and PROMETEOII/2014/003. This work was also supported by the European Union FEDER (CAPAP-H5 network TIN2014-53522-REDT).

    Hybrid CPU-GPU implementation of the transformed spatial domain channel estimation algorithm for mmWave MIMO systems

    Hybrid platforms combining multicore central processing units (CPUs) with manycore hardware accelerators such as graphics processing units (GPUs) can be smartly exploited to provide efficient parallel implementations of wireless communication algorithms for Fifth Generation (5G) and beyond systems. Massive multiple-input multiple-output (MIMO) systems are a key element of the 5G standard, involving several tens or hundreds of antenna elements for communication. Such a high number of antennas has a direct impact on the computational complexity of some MIMO signal processing algorithms. In this work, we focus on the channel estimation stage. In particular, we develop a parallel implementation of a recently proposed MIMO channel estimation algorithm. Its performance in terms of execution time is evaluated on both a multicore CPU and a GPU. The results show that some computation blocks of the algorithm are more suitable for multicore implementation, whereas other parts are more efficiently implemented on the GPU, indicating that a hybrid CPU-GPU implementation would achieve the best performance in practical applications on the tested platform.